Effectiveness of exercise intervention during pregnancy on high-risk women for gestational diabetes mellitus prevention: A meta-analysis of published RCTs

Objective We aimed to investigate the preventive role of exercise intervention during pregnancy in women at high risk for gestational diabetes mellitus (GDM). Materials and methods We searched PubMed, CENTRAL, and Scopus for randomized controlled trials (RCTs) that evaluated exercise interventions during pregnancy in women at high risk for GDM. Data were combined with random effects models. Between-study heterogeneity (Cochran’s Q statistic) and the extent of variability in study effects [I² with 95% confidence interval (CI)] were estimated. Sensitivity analyses examined the effect of population, intervention, and study characteristics. We also evaluated the potential for publication bias. Results Among the 1,508 high-risk women who were analyzed in 9 RCTs, 374 (24.8%) [160 (21.4%) in intervention, and 214 (28.1%) in control group] developed GDM. Women who received exercise intervention during pregnancy were less likely to develop GDM compared to those who followed the standard prenatal care (OR 0.70, 95%CI 0.52, 0.93; P-value 0.02) [Q 10.08, P-value 0.26; I² 21% (95%CI 0, 62%)]. Studies with low attrition bias also showed a similar result (OR 0.70, 95%CI 0.51, 0.97; P-value 0.03). A protective effect was also supported when analysis was limited to studies including women with low education level (OR 0.55; 95%CI 0.40, 0.74; P-value 0.0001); studies with exercise intervention duration more than 20 weeks (OR 0.54; 95%CI 0.40, 0.74; P-value 0.0007); and studies with a motivation component in the intervention (OR 0.69, 95%CI 0.50, 0.96; P-value 0.03). We could not exclude large variability in study effects because the upper limit of the I² confidence interval was higher than 50% for all analyses. There was no conclusive evidence for small study effects (P-value 0.31). Conclusions Our study might support a protective effect of exercise intervention during pregnancy for high-risk women to prevent GDM. The protective result should be corroborated by large, high quality RCTs.


Introduction
Gestational diabetes mellitus (GDM) is a multifactorial disorder arising from the interaction between genetic and environmental risk factors. It is characterized by insulin resistance and decreased pancreatic β-cell function. It is also a risk factor for the future development of type 2 diabetes mellitus [1], and one of the most common diseases during pregnancy [2]. The worldwide prevalence is increasing, ranging between 2 and 14% [3]. Women with GDM have an increased risk of obstetric, fetal, neonatal, maternal, and child complications [3-11].
Findings of previous studies on the effect of exercise intervention were conflicting [4,5,13,17]. Several systematic reviews and meta-analyses [4,6,7,9,17] showed a significant risk reduction among women in the general population, while other studies [8,10,18] failed to support risk reduction for GDM. Two recent meta-analyses explored the role of exercise on GDM prevention among high-risk women. One meta-analysis [19] showed no benefit of the interventions, including exercise, compared to placebo, while the other [20] supported a significant GDM risk reduction with exercise during pregnancy among overweight and obese women. However, to our knowledge, there was no systematic approach to evaluate exercise as a single intervention during pregnancy on GDM prevention among high-risk women who had any of the risk factors for GDM and who already received standard prenatal care.
Our study aimed to systematically appraise RCTs that assessed the effectiveness of exercise during pregnancy on the prevention of GDM. We included RCTs on high-risk pregnant women with one or multiple risk factors, which compared exercise to standard prenatal care. We performed meta-analysis with special emphasis on potential biases and on sources of study heterogeneity, including both clinical and methodological factors that may account for potential variability in study effects.

Materials and methods
Our study was registered in the Open Science Framework (OSF) (Registration DOI 10.17605/OSF.IO/23NJS, https://archive.org/details/osf-registrations-23njs-v1). This systematic review was performed according to the PRISMA extension for complex interventions guideline [21]. We searched PubMed using keywords related to exercise, physical activity, and GDM combined with the Cochrane Collaboration search algorithm for RCTs. We conducted a systematic search on Scopus using the same keywords after excluding articles registered in PubMed. Finally, we searched CENTRAL including the same keywords related to exercise, physical activity, and GDM. Search algorithms were described in detail in S1 Table. Electronic searches were supplemented by perusal of the references of the retrieved papers as well as the references of review articles. One investigator (GIT) screened all databases. For items considered potentially eligible or unclear after screening the title and/or abstract, the full text was retrieved. A second investigator (AT) checked the items on which the first investigator (GIT) could not decide. Discrepancies were resolved through consensus. For trials on which we could not reach a final decision, or for which the full text could not be retrieved, we contacted investigators when an e-mail address was available. Two consecutive reminders were also sent to non-responders.

Eligibility criteria
We selected trials according to the PICO (population, intervention, comparator, and outcome) approach. We accepted randomized controlled trials (RCTs) in English that recruited pregnant women at high risk for GDM. Factors that increased pregnant women's risk included at least one of the following: increased BMI [1,4-6,9-13], sedentary lifestyle [10], family history [4,10,22], previous macrosomia [22], unbalanced diet [3,10], previous GDM [22], non-white ethnicities [2,4,10,14,22] and age > 25 years [22]. We considered as eligible trials that assessed interventions of any type of exercise during pregnancy. We accepted trials if women in the comparator group received the standard antenatal care. We considered as eligible trials that reported as outcome the onset of GDM. We accepted all modalities for GDM diagnosis. In case of multiple publications of an RCT with results in different follow-up periods, we accepted the publication including the largest sample. We excluded RCTs that were published at the protocol stage, pilot, or feasibility studies, abstracts from conference proceedings, and RCTs that did not report results on the eligible outcome.

Data extraction
Two independent researchers (GIT and KP) extracted the data. Discrepancies were resolved with consensus, and the participation of a third arbitrator (AT) where necessary. The Cohen kappa coefficient with 95% confidence interval (95% CI) was used to evaluate the agreement between the two investigators who independently extracted the data.
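The agreement statistic mentioned above can be illustrated with a minimal sketch. The labels below are hypothetical and are not the judgements made in this review; the function name `cohen_kappa` is ours, and the 95% CI reported in the study is not computed here.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / n**2
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical extraction decisions by two investigators (illustration only).
rater_1 = ["low", "low", "high", "unclear", "low", "high", "low", "unclear", "low", "high"]
rater_2 = ["low", "low", "high", "unclear", "low", "low", "low", "unclear", "low", "high"]
kappa = cohen_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects the raw proportion of agreement for the agreement expected by chance, which is why it is lower than the simple share of matching items.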
Extracted items included the name of first author, year of publication, country, whether the study was a cluster RCT, number of participating centers, study duration, drop-out rate, sample size, factors related to high risk for GDM in the participants, women's mean age, and number of participating women with low level of education if reported. We also recorded the type of intervention and the care that women in the comparator group received. For assessing the completeness of exercise intervention reporting, we used the CERT (Consensus on Exercise Reporting Template) tool for complex interventions [23]. CERT was proposed to improve reporting of exercise intervention programs in clinical trials. It includes 16 items allocated in 7 categories, i.e., materials, provider, delivery, location, dosage, tailoring, and the extent to which the exercise intervention was delivered and performed as planned [23]. In addition, we extracted potential side-effects/adverse events that were reported for the intervention and comparator arms. Finally, we recorded the number of GDM events as the outcome, separately in the experimental and the control arm. We also captured information on the method used in each study for the diagnosis of GDM.

Quality assessment of the studies and rating of overall evidence
We used the risk of bias tool proposed by the Cochrane Collaboration [24] for quality assessment of eligible RCTs. Two independent researchers (GIT and KP) extracted the data on quality assessment. Discrepancies were resolved with consensus, and the participation of a third arbitrator (AT) where necessary. In addition, we used the Grading of Recommendations, Assessment, Development and Evaluation tool (GRADE) for rating the overall evidence [25] (GRADEpro, Version 3.6.1. McMaster University, 2011).

Statistical analysis
To combine the events of GDM, we performed both fixed effects and random effects model (REM) meta-analyses. When large heterogeneity could not be excluded, we reported the REM results (odds ratio with 95% CI) [26]. Heterogeneity was evaluated with Cochran's Q statistic (statistically significant for P < 0.10), and it was quantified with the I² metric (low, moderate, large, very large for values of <25, 25-49, 50-74, >75%, respectively) [27]. The main analyses included all available data. We performed separate analyses limited to studies where increased BMI was included as a risk factor for GDM, and studies that did not consider BMI; studies where the percentage of participating women with low level education was more than 5%; studies that evaluated an intervention delivered individually, and studies that evaluated an intervention delivered in a group; trials that included a motivation component in the intervention, and trials that did not include motivation; studies with an intervention duration more than 20 weeks, and studies with an intervention duration up to 20 weeks. We also performed meta-regression analyses on the GDM OR. Baseline risk and study duration were included individually as covariates in the meta-regressions. For each meta-regression, the slope coefficient with the standard error (SE), the permutation-based P-value (as suggested by Higgins and Thompson [28]), and the τ² were reported. Publication bias was evaluated via visual analysis of the funnel plot, which shows a symmetrical inverted funnel in the absence of bias [29]. To further investigate potential asymmetry due to publication bias, we performed Egger's test [30]. We also performed separate analyses for studies with low detection bias (studies reporting blinding of outcome assessors), and for studies with low attrition bias (studies with less than 20% of participants lost to follow-up).
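The pooling steps described above can be sketched as follows. This is an illustration under stated assumptions, not the software used in the study: the DerSimonian-Laird estimator of τ² is assumed (the text only states that random effects models were used), the function names are ours, and the 2x2 counts are hypothetical.

```python
import math


def log_odds_ratio(events_t, n_t, events_c, n_c):
    """Log odds ratio and its variance from one trial's 2x2 counts."""
    a, b = events_t, n_t - events_t
    c, d = events_c, n_c - events_c
    if 0 in (a, b, c, d):
        # Assumed 0.5 continuity correction when any cell is empty.
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def random_effects(trials):
    """Pool trials given as (events_t, n_t, events_c, n_c) tuples."""
    if len(trials) < 2:
        raise ValueError("need at least two trials")
    effects = [log_odds_ratio(*t) for t in trials]
    y = [e for e, _ in effects]
    w = [1 / v for _, v in effects]  # inverse-variance (fixed effects) weights
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed effects estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(trials) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # DerSimonian-Laird between-study variance (assumed estimator).
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    w_re = [1 / (v + tau2) for _, v in effects]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical trials (illustration only, not data from the eligible RCTs).
hypothetical_trials = [(12, 100, 20, 100), (30, 150, 38, 150), (8, 60, 15, 62)]
result = random_effects(hypothetical_trials)
print(result)
```

When τ² is estimated as zero, the random effects weights reduce to the fixed effects weights and both models give the same summary estimate.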
The level of significance for all analyses, except for Cochran's Q statistic, was set at P-value < 0.05. For our analyses, we used

Eligible studies
Our search yielded 1566 items (582 in PubMed, 290 in Scopus, and 694 in CENTRAL). We excluded 268 as duplicates. Out of the 1298 remaining items, we excluded 1260 as non-relevant based on the title or abstract. Thus, we retrieved 38 papers in full text. Out of the 38 articles, we excluded 29: one paper reported a pilot study; 8 studies did not include an eligible population; 7 studies included a non-eligible intervention; and 13 trials did not report the onset of GDM as an outcome. Finally, we included 9 published RCTs as eligible for our study (Fig 1).

Characteristics of eligible studies
Eligible studies were published from 2012 to 2017. Four RCTs [13,31-33] were conducted in Europe (one study in the Netherlands, one in Spain, one in Norway, and one in Ireland); two studies were conducted in Oceania [34,35] (one in New Zealand and one in Australia), two in the USA [2,36], and one [5] in China (Table 1). All trials used the participant as the randomization unit and had a parallel design. One study [31] was multi-centered (five participating centers). The duration of the trials ranged from 19 to 60 months. The drop-out rate was < 20% for all studies, except for one study [36] in which it was 31.9% (Table 1).
A total of 1,738 women at high risk for GDM (866 in intervention, and 872 in control group) participated in the eligible trials. Six studies included overweight and obesity as risk factors [2,5,13,31,32,34]. Additional risk factors included history of GDM in three studies [2,31,35], history of type 1 and 2 diabetes mellitus in first- and second-degree relatives in two studies [2,31], history of macrosomia in one study [31], and previously sedentary lifestyle in two studies [33,36] (Table 2). Mean age ranged from 24.9 to 37.7 years for women in the intervention group, and from 20.3 to 37.7 years for women in the control group (Table 2). One study [2] reported only the age range (18 to 40 years) (Table 2). Percentage of women with low education level ranged from 2% to 34% in the intervention group, and from 7% to 34.7% in the control group. Four studies [32,34-36] did not report data on participants' education level (Table 2).

Reporting of exercise intervention in eligible studies
Based on CERT [23], we captured the number of studies with inadequate reporting of the description of the exercise intervention (S2 Table). No trial provided adequate information for reproducing the exercise intervention. Six [2,5,31-33,36] out of nine papers did not provide any information on the content of the home program component. Six trials [5,13,31,33-35] did not adequately report on non-exercise components. Generally, all trials provided information on the exercise components and the necessary equipment; on the provider and the supervision of the intervention; as well as on adherence, on potential side-effects/adverse events, and on dosage.

Effectiveness and safety of exercise during pregnancy
GDM was diagnosed by measuring fasting blood glucose, hemoglobin A1c, or by an oral glucose tolerance test (Table 4). A total of 374 (24.8%) [160 (21.4%) in intervention, and 214 (28.1%) in control group] developed GDM among the 1,508 high-risk women analysed for GDM outcome (Fig 2). When the nine trials were combined, there was no statistically significant between-study heterogeneity (Q 10.08, P-value 0.26). However, we could not exclude large variability in study effects due to real study differences, because the upper limit of the I² confidence interval exceeded 50% [I² 21% (95%CI 0, 62%)]. Thus, random effects estimates would be more appropriate for data synthesis and fixed effects estimates were not presented. Women who received exercise during pregnancy were on average less likely to develop GDM compared to women who followed only the standard prenatal care (OR 0.70, 95%CI 0.52, 0.93; P-value 0.02) (Fig 2). The summary odds ratio also showed a significant effect when analyses were limited to studies with more than 5% of the participating women reporting a low education level (OR 0.55, 95%CI 0.40, 0.74; P-value 0.0001); studies reporting the use of a motivation component in the intervention (OR 0.69, 95%CI 0.50, 0.96; P-value 0.03); and studies that evaluated an intervention with duration more than 20 weeks (OR 0.54, 95%CI 0.40, 0.74; P-value 0.0001). However, the test of difference was significant only for the subgroup analysis based on exercise duration (studies with exercise duration more than 20 weeks vs. studies with duration up to 20 weeks; P-value = 0.02) (S3 Table). In sensitivity analyses, the summary odds ratio remained statistically significant when studies were limited to those with a low attrition bias (OR 0.70, 95%CI 0.51, 0.97; P-value 0.03). We could not exclude large variability in study effects due to real study differences for all subgroup and sensitivity analyses (S3 Table).
Thus, even statistically significant effects should be interpreted with caution because the true differences in effects across studies might be due to unidentified or unexplained underlying factors. Meta-regression analyses with baseline risk and study duration as covariates did not show a statistically significant effect on the summary OR (S4 Table). Pregnancy-induced hypertension was the most frequently reported adverse event. Four trials [13,31,32,34] reported that there was no adverse event (Table 4).
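The heterogeneity and summary statistics reported in this section can be cross-checked from the published numbers alone. The sketch below assumes the usual relation I² = (Q − df)/Q with df equal to the number of trials minus one, and a normal approximation on the log odds ratio scale; because it starts from rounded values, it can only be expected to land close to the reported I² of 21% and P-value of 0.02. The function names are ours.

```python
import math


def i_squared(q, n_studies):
    """I2 (in percent) from Cochran's Q and the number of studies."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


def p_from_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Two-sided P-value recovered from an OR and its 95% CI (normal approximation)."""
    # The CI width on the log scale is 2 * z_crit * SE.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # erfc(|z| / sqrt(2)) equals the two-sided tail area of the standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


# Values as reported for the main analysis of the nine trials.
i2 = i_squared(10.08, 9)
p_main = p_from_ci(0.70, 0.52, 0.93)
print(f"I2 = {i2:.1f}%, two-sided P = {p_main:.3f}")
```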

Quality of reporting, potential bias, and quality of evidence
There was good agreement between the two independent researchers [Cohen's κ 91.4% (95% CI 82.8%, 100%; P-value <0.001)]. Based on the overall risk, four trials were judged to raise some concerns because they failed to report specific quality domains. Specifically, two trials [2,35] did not provide information on participants and personnel blinding, and on blinding of outcome assessors; and two studies [13,31] did not provide information on participants and personnel blinding only. Three RCTs were judged to be at high risk of bias. One of them [36] did not provide information on participants and personnel blinding, and on blinding of outcome assessors; in addition, it reported a drop-out rate of 31.9%. The other two trials [5,32] were unblinded for participants and personnel; one of them [5] was also unblinded for outcome assessors (Table 5).
Based on the funnel plot assessment, there was variation in the standard error of the studies. However, small studies were reasonably closely distributed around the summary effect estimate [29] (Fig 3). Egger's test of small study effects had a P-value of 0.31, and thus, it was not fully conclusive.
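The regression behind Egger's test can be sketched as follows: each study's standardized effect (log OR divided by its SE) is regressed on its precision (1/SE), and an intercept far from zero suggests funnel plot asymmetry. This is a minimal illustration with hypothetical inputs and our own function name; the P-value, which requires a t distribution with n − 2 degrees of freedom, is deliberately left to a statistics package.

```python
import math


def egger_intercept(log_effects, standard_errors):
    """Intercept and its standard error from Egger's regression (ordinary least squares)."""
    if len(log_effects) != len(standard_errors) or len(log_effects) < 3:
        raise ValueError("need at least three studies with matching inputs")
    x = [1 / se for se in standard_errors]                       # precision
    y = [e / se for e, se in zip(log_effects, standard_errors)]  # standardized effect
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all studies have the same precision")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom.
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = residual_ss / (n - 2)
    se_intercept = math.sqrt(sigma2 * (1 / n + mean_x**2 / sxx))
    return intercept, se_intercept


# Hypothetical log odds ratios and standard errors (illustration only).
hyp_log_or = [-0.60, -0.31, -0.45, -0.10, -0.52]
hyp_se = [0.40, 0.22, 0.35, 0.15, 0.30]
intercept, se_intercept = egger_intercept(hyp_log_or, hyp_se)
print(f"Egger intercept = {intercept:.2f} (SE {se_intercept:.2f})")
```

With only nine trials, as in this review, the test has limited power, which is consistent with treating a non-significant result as inconclusive rather than as proof of symmetry.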
Five out of the 9 studies had potential performance, detection, or attrition bias. The other 4 studies were unclear about blinding. Overall, moderate quality of evidence showed that exercise during pregnancy for the population of women with high risk for GDM may have benefit when compared to standard prenatal care in reducing the risk of GDM (S5 Table).

PLOS ONE
Exercise intervention in gestational diabetes mellitus

Discussion
Our study showed that on average an exercise intervention during pregnancy may have a beneficial effect in preventing high-risk pregnant women from developing GDM. There was no significant between-study heterogeneity. However, we noticed that a large variability in study effects could not be excluded. A potential beneficial effect was also supported when analyses were limited to studies with more than 5% of the participating women reporting a low education level; studies reporting the use of a motivation component in the intervention; and studies that evaluated an intervention with duration more than 20 weeks. Subgroup and sensitivity analyses did not identify a clinical or methodological factor that may explain the potential large variability.
Our meta-analysis supported the possibility that specific exercise programs during pregnancy may decrease the GDM incidence. Exercise programs should follow guidelines for designing complex interventions [21]. Based on CERT [23], reporting of several intervention characteristics was missing. The study [5] with a significant decrease in GDM incidence did not provide data on whether the intervention was in group or applied individually; on any motivation strategies, on the content of home exercise, and on other non-exercise components; and on whether the exercise intervention was individually tailored or not. Thus, it may not be feasible for this intervention to be reproduced in future trials. Previous studies on complex interventions also showed inadequate reporting [4, 6-10, 17, 18].
Other interventions such as diet, supplements, and medications were evaluated for GDM prevention. Results regarding these interventions also need to be scrutinized. Some of these interventions may be important, but spurious effects due to various biases may be affecting these trials as well. For the general population, some meta-analyses assessing exercise interventions with or without a diet component also showed statistically significant GDM risk reduction [4,6,7,9,17]. However, other studies [8,10,18,19] did not support similar results. In line with our study, a meta-analysis that evaluated exercise among overweight or obese women showed a reduction in GDM incidence [20]. Another meta-analysis [37], which evaluated the effect of different types of exercise and metformin on pregnancy outcomes in overweight and obese pregnant women, showed a reduced risk for GDM with aerobic exercise. However, our subgroup analysis limited to studies that included high-risk women based on the BMI criterion did not show a significant effect of exercise intervention on GDM incidence. Compared to previous studies, our meta-analysis followed a more pragmatic approach for population eligibility, including not only studies with pregnant women with increased BMI but also studies with pregnant women with other risk factors for GDM. Based on our subgroup analysis, future research on exercise interventions with adequate duration among pregnant women might be promising. However, this result should be interpreted cautiously since a large variability in study effects could not be excluded.
Several modifiable factors as well as non-modifiable factors may contribute to GDM. Obese women had twice the risk for GDM as compared to women with normal body weight [5]. Elevated pre-pregnancy BMI is associated with complications during pregnancy, regardless of GDM onset [1,14,16]. A cost-effectiveness study [38] showed that promoting healthy eating and physical activity was the preferred strategy for limiting weight gain during pregnancy. However, the exact intervention components that may lead to clinically significant risk reduction are yet to be determined. Previous studies supported that women during pregnancy showed low motivation to change their lifestyle [11]. Our subgroup analysis limited to studies that included a motivation component in the intervention also supported a significant effect of exercise intervention on GDM incidence. Therefore, exercise interventions may include a behavior change component. They may also address social determinants of health including education level to improve literacy, and access to health care services in addition to biological factors, and the right timing for women to start the intervention to prevent GDM.
Our findings may support on average a protective effect of exercise intervention during pregnancy for GDM prevention among women with GDM risk factors. However, both for the main analysis and for the subgroup and sensitivity analyses, potential large true differences in effects among studies could not be excluded. There may be additional unidentified or unexplained underlying factors that may account for the differences in effects. Future large, good quality trials recruiting pregnant women of low education level and evaluating an exercise intervention with satisfactory duration need to adequately report on the intervention characteristics to allow for evaluating a potential frequency-response relationship between exercise and GDM risk reduction [12]. Additionally, they need to provide an adequate description of the exercise intervention programs for their accurate reproduction [23]. Motivation techniques for participants to complete the intervention, intensive monitoring to minimize losses to follow-up that are not due to miscarriage, premature delivery, or fetal death in utero, and procedures that enhance fidelity are prerequisites for adequately implementing exercise interventions. Additional efforts to ensure blinding of both participants and researchers are imperative to support robustness of the results. In previous trials, investigators found it difficult to double-blind RCTs due to the nature of the intervention [3,11]. Previous results on diet interventions to prevent GDM were also heterogeneous [12,13]. Future trials assessing interventions including multiple components, i.e., diet, exercise, behavioral counseling, and social support, are needed to provide definitive answers on their benefit and sustainability [10,18].
Our study had several limitations. We included only studies that evaluated interventions initiated during pregnancy; therefore, our findings cannot be generalized to exercise interventions that may begin before pregnancy. However, this limited the heterogeneity of the duration of intervention among studies. We included RCTs that recruited only high-risk women; therefore, our results cannot be generalized to the general population. However, we considered as high-risk not only women who were overweight or obese but also women with other risk factors including ethnicity, medical and family history, and sedentary lifestyle. By broadening the criteria, we tried to achieve a pragmatic approach to the population included in our work. Searching for grey literature might have identified additional studies; however, unpublished results would still have remained unknown.

Conclusion
In conclusion, our study may support a beneficial effect of exercise interventions during pregnancy in addition to standard antenatal care for preventing GDM among high-risk women. Furthermore, a protective effect for specific population subgroups, i.e., women with low education level, and for interventions with specific characteristics, i.e., with more than 20 weeks duration and with motivational strategies, cannot be excluded. Future large, good quality studies focusing on specific populations of women and evaluating interventions with adequate duration are necessary.